ACII情感声音爆发研讨会和竞争的重点是理解声乐爆发的多个情感维度:笑声,喘息,哭泣,尖叫声以及许多其他非语言声音,这是情感表达和人类交流的核心。今年的比赛包括四首曲目,使用1,702位扬声器的大规模和野外数据集提供59,299个发声。首先是A-VB高任务,要求竞争参与者使用十个类似的注释的情感表达强度,对情感进行新型模型进行多标签回归,包括:敬畏,恐惧和惊喜。第二个是A-VB-TWO任务,利用更传统的二维模型来进行情感,唤醒和价值。第三个是A-VB文化任务,要求参与者探索数据集的文化方面,培训本地国家依赖模型。最后,对于第四个任务,A-VB型,参与者应认识到声乐爆发的类型(例如,笑声,哭泣,咕unt)是8级分类。本文介绍了使用最先进的机器学习方法的四个轨道和基线系统。每条轨道的基线性能是通过使用端到端深度学习模型获得的,如下所示:对于A-VB-高,平均(超过10维)一致性相关系数(CCC)为0.5687 CCC为获得;对于A-VB-TWO,获得了0.5084的平均值(超过2维);对于A-VB培养物,从四个培养物中获得了0.4401的平均CCC;对于A-VB型,来自8类的基线未加权平均召回(UAR)为0.4172 UAR。
translated by 谷歌翻译
ICML表达性发声(EXVO)的竞争重点是理解和产生声音爆发:笑声,喘息,哭泣和其他非语言发声,这是情感表达和交流至关重要的。 EXVO 2022,包括三个竞赛曲目,使用来自1,702位扬声器的59,201个发声的大规模数据集。首先是Exvo-Multitask,要求参与者训练多任务模型,以识别声音爆发中表达的情绪和人口特征。第二个,即exvo生成,要求参与者训练一种生成模型,该模型产生声音爆发,传达了十种不同的情绪。第三个exvo-fewshot要求参与者利用少量的学习融合说话者身份来训练模型,以识别声音爆发传达的10种情感。本文描述了这三个曲目,并使用最先进的机器学习策略为基线模型提供了绩效指标。每个曲目的基线如下,对于exvo-multitask,一个组合得分,计算一致性相关系数的谐波平均值(CCC),未加权的平均召回(UAR)和反向平均绝对错误(MAE)(MAE)($ s_ {mtl) } $)充其量是0.335 $ s_ {mtl} $;对于exvo生成,我们报告了Fr \'Echet Inception距离(FID)的得分范围为4.81至8.27(取决于情绪),在训练集和生成的样品之间。然后,我们将倒置的FID与生成样品的感知评级($ s_ {gen} $)相结合,并获得0.174 $ s_ {gen} $;对于Exvo-Fewshot,获得平均CCC为0.444。
translated by 谷歌翻译
自动影响使用视觉提示的识别是对人类和机器之间完全互动的重要任务。可以在辅导系统和人机交互中找到应用程序。朝向该方向的关键步骤是面部特征提取。在本文中,我们提出了一个面部特征提取器模型,由Realey公司提供的野外和大规模收集的视频数据集培训。数据集由百万标记的框架组成,2,616万科目。随着时间信息对情绪识别域很重要,我们利用LSTM单元来捕获数据中的时间动态。为了展示我们预先训练的面部影响模型的有利性质,我们使用Recola数据库,并与当前的最先进的方法进行比较。我们的模型在一致的相关系数方面提供了最佳结果。
translated by 谷歌翻译
Mapping the seafloor with underwater imaging cameras is of significant importance for various applications including marine engineering, geology, geomorphology, archaeology and biology. For shallow waters, among the underwater imaging challenges, caustics i.e., the complex physical phenomena resulting from the projection of light rays being refracted by the wavy surface, is likely the most crucial one. Caustics is the main factor during underwater imaging campaigns that massively degrade image quality and affect severely any 2D mosaicking or 3D reconstruction of the seabed. In this work, we propose a novel method for correcting the radiometric effects of caustics on shallow underwater imagery. Contrary to the state-of-the-art, the developed method can handle seabed and riverbed of any anaglyph, correcting the images using real pixel information, thus, improving image matching and 3D reconstruction processes. In particular, the developed method employs deep learning architectures in order to classify image pixels to "non-caustics" and "caustics". Then, exploits the 3D geometry of the scene to achieve a pixel-wise correction, by transferring appropriate color values between the overlapping underwater images. Moreover, to fill the current gap, we have collected, annotated and structured a real-world caustic dataset, namely R-CAUSTIC, which is openly available. Overall, based on the experimental results and validation the developed methodology is quite promising in both detecting caustics and reconstructing their intensity.
translated by 谷歌翻译
Automatic fake news detection is a challenging problem in misinformation spreading, and it has tremendous real-world political and social impacts. Past studies have proposed machine learning-based methods for detecting such fake news, focusing on different properties of the published news articles, such as linguistic characteristics of the actual content, which however have limitations due to the apparent language barriers. Departing from such efforts, we propose FNDaaS, the first automatic, content-agnostic fake news detection method, that considers new and unstudied features such as network and structural characteristics per news website. This method can be enforced as-a-Service, either at the ISP-side for easier scalability and maintenance, or user-side for better end-user privacy. We demonstrate the efficacy of our method using data crawled from existing lists of 637 fake and 1183 real news websites, and by building and testing a proof of concept system that materializes our proposal. Our analysis of data collected from these websites shows that the vast majority of fake news domains are very young and appear to have lower time periods of an IP associated with their domain than real news ones. By conducting various experiments with machine learning classifiers, we demonstrate that FNDaaS can achieve an AUC score of up to 0.967 on past sites, and up to 77-92% accuracy on newly-flagged ones.
translated by 谷歌翻译
Verifying the input-output relationships of a neural network so as to achieve some desired performance specification is a difficult, yet important, problem due to the growing ubiquity of neural nets in many engineering applications. We use ideas from probability theory in the frequency domain to provide probabilistic verification guarantees for ReLU neural networks. Specifically, we interpret a (deep) feedforward neural network as a discrete dynamical system over a finite horizon that shapes distributions of initial states, and use characteristic functions to propagate the distribution of the input data through the network. Using the inverse Fourier transform, we obtain the corresponding cumulative distribution function of the output set, which can be used to check if the network is performing as expected given any random point from the input set. The proposed approach does not require distributions to have well-defined moments or moment generating functions. We demonstrate our proposed approach on two examples, and compare its performance to related approaches.
translated by 谷歌翻译
We propose AstroSLAM, a standalone vision-based solution for autonomous online navigation around an unknown target small celestial body. AstroSLAM is predicated on the formulation of the SLAM problem as an incrementally growing factor graph, facilitated by the use of the GTSAM library and the iSAM2 engine. By combining sensor fusion with orbital motion priors, we achieve improved performance over a baseline SLAM solution. We incorporate orbital motion constraints into the factor graph by devising a novel relative dynamics factor, which links the relative pose of the spacecraft to the problem of predicting trajectories stemming from the motion of the spacecraft in the vicinity of the small body. We demonstrate the excellent performance of AstroSLAM using both real legacy mission imagery and trajectory data courtesy of NASA's Planetary Data System, as well as real in-lab imagery data generated on a 3 degree-of-freedom spacecraft simulator test-bed.
translated by 谷歌翻译
The de facto standard of dynamic histogram binning for radiomic feature extraction leads to an elevated sensitivity to fluctuations in annotated regions. This may impact the majority of radiomic studies published recently and contribute to issues regarding poor reproducibility of radiomic-based machine learning that has led to significant efforts for data harmonization; however, we believe the issues highlighted here are comparatively neglected, but often remedied by choosing static binning. The field of radiomics has improved through the development of community standards and open-source libraries such as PyRadiomics. But differences in image acquisition, systematic differences between observers' annotations, and preprocessing steps still pose challenges. These can change the distribution of voxels altering extracted features and can be exacerbated with dynamic binning.
translated by 谷歌翻译
在本文中,我们为自主机器人提供了一种新型的模型预测控制方法,受到任意形式的不确定性。拟议的风险感知模型预测路径积分(RA-MPPI)控制利用条件价值(CVAR)度量来为安全关键的机器人应用生成最佳控制动作。与大多数现有的随机MPC和CVAR优化方法不同,这些方法将原始动力学线性化并将控制任务制定为凸面程序,而拟议的方法直接使用原始动力学,而无需限制成本函数或噪声的形式。我们将新颖的RA-MPPI控制器应用于自动驾驶汽车,以在混乱的环境中进行积极的驾驶操作。我们的仿真和实验表明,与基线MPPI控制器相比,提出的RA-MPPI控制器可以达到大约相同的圈时间,而碰撞的碰撞明显少得多。所提出的控制器以高达80Hz的更新频率执行在线计算,利用现代图形处理单元(GPU)来进行多线程轨迹以及CVAR值的生成。
translated by 谷歌翻译
我们介绍了第一个机器学习引力波搜索模拟数据挑战(MLGWSC-1)的结果。在这一挑战中,参与的小组必须从二进制黑洞合并中识别出复杂性和持续时间逐渐嵌入在逐渐更现实的噪声中的引力波信号。 4个提供的数据集中的决赛包含O3A观察的真实噪声,并发出了20秒的持续时间,其中包含进动效应和高阶模式。我们介绍了在提交前从参与者未知的1个月的测试数据中得出的6个输入算法的平均灵敏度距离和运行时。其中4个是机器学习算法。我们发现,最好的基于机器学习的算法能够以每月1个的错误警报率(FAR)的速度(FAR)实现基于匹配过滤的生产分析的敏感距离的95%。相反,对于真实的噪音,领先的机器学习搜索获得了70%。为了更高的范围,敏感距离缩小的差异缩小到某些数据集上选择机器学习提交的范围$ \ geq 200 $以优于传统搜索算法的程度。我们的结果表明,当前的机器学习搜索算法可能已经在有限的参数区域中对某些生产设置有用。为了改善最新的技术,机器学习算法需要降低他们能够检测信号并将其有效性扩展到参数空间区域的虚假警报率,在这些区域中,建模的搜索在计算上很昂贵。根据我们的发现,我们汇编了我们认为,将机器学习搜索提升到重力波信号检测中的宝贵工具,我们认为这是最重要的研究领域。
translated by 谷歌翻译